Project Summary

Women have traditionally embraced multiple roles, including mother, daughter, wife, sister, and friend, often by choice or necessity. In today’s world, the role of women has evolved significantly and positively. Women are now educated, self-reliant and financially independent. They have various avenues for personal fulfillment and joy. This project aims to explore these avenues to uncover the words and topics linked to women’s happiness. The analysis comprises 4 sections: analyzing women’s happiness overall, and comparing it between women across different life stages (married/unmarried), in diverse global regions (developed/undeveloped), and among various age groups (20s/30s/40+).

# Importing the necessary libraries
library(tm)
library(tidytext)
library(tidyverse)
library(devtools)
library(DT)
library(scales)
library(countrycode)
library(dplyr)
library(ggplot2)
library(NLP)
library(tibble)
library(topicmodels)
# Installing wordcloud2 from GitHub before loading it (only needs to run once)
install_github("gaospecial/wordcloud2")
library(wordcloud2)
library(gridExtra)
library(ngram)
# Reading "processed_moments.csv" created in data_preprocessing.R (in lib folder)
hm_data <- read_csv("~/Desktop/processed_moments.csv")

# Reading the demographics file
demographics<-'https://raw.githubusercontent.com/rit-public/HappyDB/master/happydb/data/demographic.csv'
demo_data <- read_csv(demographics)

Happiness takes on various meanings for different women. In this analysis, I aim to explore these unique definitions by examining the words and topics that women connect with this cherished emotion.

Section 1: What do women associate happiness with?

# Building a table of female respondents only by joining hm_data with demo_data
all_women <- hm_data %>%
  inner_join(demo_data, by = "wid") %>%
  select(wid,
         original_hm,
         gender,
         marital,
         parenthood,
         reflection_period,
         age,
         country,
         ground_truth_category,
         text) %>%
  # Keep female respondents, then count words on the joined rows themselves
  # (hm_data$text need not have the same length as the joined table)
  filter(gender == "f") %>%
  mutate(count = sapply(text, wordcount, USE.NAMES = FALSE))

datatable(all_women)
# Creating a bag of words using the text data
bag_of_words_all_women <-  all_women %>%
  unnest_tokens(word, text)

word_count_all_women <- bag_of_words_all_women %>%
  count(word, sort = TRUE)
Overview of Words that Women Associate Happiness With
# making a word cloud
wordcloud2(word_count_all_women, size = 0.6, rotateRatio = 0)

From the word cloud, it is evident that women derive happiness from various sources. The most prominent among these is the companionship of their friends, indicating the importance of social connections. Additionally, family members, including husbands and children, play a significant role in contributing to women’s happiness. The presence of words like “surprise,” “celebrated,” and “enjoyed” suggests that women also associate happiness with moments of celebration and enjoyment.

Top 10 Pairs of Words Women Associate Happiness With

ggplot(top_bigrams_all_women, aes(x = reorder(paste(word1, word2, sep = " "), n), y = n)) +
  geom_bar(stat = "identity", fill = "blue") +
  labs(x = "Pair of Words", y = "Frequency") +
  coord_flip() + 
  ggtitle("Top 10 Pairs of Words Women Associate Happiness With") +
  theme_minimal()+theme(plot.title = element_text(hjust = 0.5))

From the bigram visualization, it appears that celebrating birthdays is a significant source of joy for some women. This is followed by pairs of words that indicate family and friends. Let’s figure out the top word that was used to describe happiness among women. This means examining unigram frequencies and identifying the single word most commonly associated with happiness in the all_women dataset.
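The report does not show how the bigram and unigram tables behind these plots are built. The following is a minimal sketch of one way to produce them with tidytext, under the assumption that each row holds one cleaned moment in a text column. The toy_women table and the *_toy names are hypothetical stand-ins, not objects from the original analysis.

```r
library(dplyr)
library(tidyr)
library(tidytext)

# Hypothetical stand-in for all_women: one cleaned happy moment per row
toy_women <- tibble(
  wid  = c(1, 2, 3),
  text = c("celebrated birthday friend",
           "daughter birthday party",
           "celebrated birthday family")
)

# Bigrams: tokenize into consecutive word pairs, split each pair, count, keep the top 10
top_bigrams_toy <- toy_women %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2) %>%
  filter(!is.na(bigram)) %>%
  separate(bigram, into = c("word1", "word2"), sep = " ") %>%
  count(word1, word2, sort = TRUE) %>%
  slice_head(n = 10)

# Unigrams: tokenize into single words, count, keep the top 10
top_words_toy <- toy_women %>%
  unnest_tokens(word, text) %>%
  count(word, sort = TRUE) %>%
  slice_head(n = 10)

print(top_bigrams_toy)
print(top_words_toy)
```

Tables shaped like these (word1, word2, n and word, n) are what the ggplot calls in this section expect.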

Top 10 Words Women Associate Happiness With

# Plotting the top 10 unigrams
ggplot(top_words_all_women, aes(x = reorder(word, n), y = n)) +
  geom_bar(stat = "identity", fill = "dark green") +
  labs(x = "Word", y = "Frequency") +
  coord_flip() +  
  ggtitle("Top 10 Words Tied to Women's Happiness") +
  theme_minimal()+theme(plot.title = element_text(hjust = 0.5))

As anticipated, the unigram analysis reveals that “friend” and “family” are the primary elements that women associate with happiness.

Next, I perform topic modelling using Latent Dirichlet Allocation (LDA) to get the top 10 topics women associate happiness with.
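The model-fitting step is not shown in the report, so here is a minimal sketch of how an LDA model and its term table could be produced with tidytext and topicmodels. The toy data, the *_toy names, the seed, and the choice of k = 3 (the report uses 10 topics) are all assumptions made for illustration.

```r
library(dplyr)
library(tidytext)
library(tm)
library(topicmodels)

# Hypothetical stand-in for all_women
toy_women <- tibble(
  wid  = 1:6,
  text = c("family dinner husband", "son daughter school", "job promotion work",
           "family dinner mother", "daughter school play", "work job bonus")
)

# Treat each moment as one document and cast the word counts to a DocumentTermMatrix
dtm_toy <- toy_women %>%
  mutate(doc_id = row_number()) %>%
  unnest_tokens(word, text) %>%
  count(doc_id, word) %>%
  cast_dtm(doc_id, word, n)

# Fit LDA; k = 3 only because the toy corpus is tiny, and the seed is arbitrary
lda_model_toy <- LDA(dtm_toy, k = 3, control = list(seed = 1234))

lda_terms_toy  <- terms(lda_model_toy, 3)   # top 3 terms per topic
lda_topics_toy <- topics(lda_model_toy, 1)  # most likely topic for each document

print(lda_terms_toy)
```

One caveat with this approach: moments that end up with no tokens are dropped from the matrix, so the vector of assigned topics can be shorter than the original table. Attaching it as a new column only works when no rows were dropped.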

Top 10 Topics Women Associate Happiness With
# Top 10 topics (with 10 terms each)  women associate happiness with
head(lda_terms_all_women, 10)
# plotting these topics into a bar chart
lda_topics <- topics(lda_model_all_women, k = 1)
all_women$topic <- as.factor(lda_topics)
ggplot(data = all_women) +
  geom_bar(stat = "count", aes(x = topic, fill = topic)) +
  scale_fill_discrete(name = "Topics",
                      labels = c("1. Family", "2. Food", "3. Children", "4. Job", 
                                 "5. Education", "6. Reading", 
                                 "7. Shopping", "8. Nature", "9. Celebration", "10. Entertainment")) +
  ylab("Number of Happy Moments") + xlab("Topics")+ggtitle("Top 10 Topics Tied to Women's Happiness") +
  theme_minimal()+theme(plot.title = element_text(hjust = 0.5))

Overall, women find happiness in family and friends. They also express interest in additional aspects of life, including job, shopping and nature, indicating their ongoing exploration of diverse sources of joy. But do these topics change when we subset women into various categories? The following sections will explore this. We start off by comparing the sources of happiness between unmarried and married women.

Section 2: Is there a difference between the sources of happiness between unmarried and married women?

Note that single, widowed and divorced women are considered unmarried for the purposes of the analysis.
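As a minimal sketch of this grouping rule, the split could look like the following. The toy table and the marital_group, married_toy and unmarried_toy names are hypothetical. The rule also sends any other non-married status the data may contain into the unmarried group, which is an assumption on my part.

```r
library(dplyr)

# Hypothetical stand-in for the marital column of all_women
toy_women <- tibble(
  wid     = 1:5,
  marital = c("single", "married", "divorced", "widowed", NA)
)

# Drop missing statuses, then label everything that is not "married" as unmarried
toy_women <- toy_women %>%
  filter(!is.na(marital)) %>%
  mutate(marital_group = if_else(marital == "married", "married", "unmarried"))

married_toy   <- filter(toy_women, marital_group == "married")
unmarried_toy <- filter(toy_women, marital_group == "unmarried")

print(count(toy_women, marital_group))
```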

Overview of Words that Unmarried Women Associate Happiness With
wordcloud2(word_count_unmarried_women, size = 0.6, rotateRatio = 0)

Friends continue to be a big source of joy. This is followed by words that resonate with themes of relationship, celebration and shopping.

Overview of Words that Married Women Associate Happiness With
wordcloud2(word_count_married_women, size = 0.6, rotateRatio = 0)

Words such as “husband”, “daughter” and “son” suggest that married women place high importance on personal relationships.

Top 10 Pairs of Words Unmarried Women Associate Happiness With

The bigram visualization highlights key themes of family, friends, and celebrations as central to women’s happiness. Additionally, unmarried women appear to derive personal fulfillment from activities like watching TV and job interviews.

Top 10 Pairs of Words Married Women Associate Happiness With

For married women, happiness stems primarily from family relationships.

Top 10 Words Unmarried Women Associate Happiness With

Top 10 Words Married Women Associate Happiness With

The unigrams reflect the same themes seen in the bigrams.

Next, I perform topic modelling using Latent Dirichlet Allocation (LDA) to get the top 10 topics married and unmarried women associate happiness with.

Top 10 Topics Unmarried Women Associate Happiness With

Top 10 Topics Married Women Associate Happiness With

Overall, unmarried women tend to derive happiness from interesting topics like nature, education and job. On the other hand, children are a constant source of joy for married women. It also seems like married women have an additional appreciation for food. Despite these differences, both groups share the belief that family is the most significant source of joy.

The next plot is a scatterplot that compares the happy moments of unmarried and married women.

Word Proportion Scatterplot for Unmarried and Married Women
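The code for this plot is not included, so the following is a minimal sketch of a word proportion scatterplot, assuming each moment carries a group label. The toy data and the marital_group, word_props_toy and scatter_toy names are hypothetical. Each word's share of all words is computed within a group and the two shares are plotted against each other on log scales.

```r
library(dplyr)
library(tidyr)
library(tidytext)
library(ggplot2)

# Hypothetical stand-in: moments already labelled by group
toy_women <- tibble(
  marital_group = c("married", "married", "unmarried", "unmarried"),
  text = c("husband dinner family", "daughter birthday family",
           "friend movie family", "friend job interview")
)

# Share of each word within its group, with one column per group
word_props_toy <- toy_women %>%
  unnest_tokens(word, text) %>%
  count(marital_group, word) %>%
  group_by(marital_group) %>%
  mutate(proportion = n / sum(n)) %>%
  ungroup() %>%
  select(-n) %>%
  pivot_wider(names_from = marital_group, values_from = proportion)

# Words near the dashed diagonal are used about equally often by both groups
scatter_toy <- ggplot(word_props_toy, aes(x = unmarried, y = married, label = word)) +
  geom_abline(linetype = "dashed", colour = "grey50") +
  geom_point(alpha = 0.5, na.rm = TRUE) +
  geom_text(vjust = 1.5, check_overlap = TRUE, na.rm = TRUE) +
  scale_x_log10(labels = scales::percent_format()) +
  scale_y_log10(labels = scales::percent_format()) +
  theme_minimal()
```

Words that appear in only one group have a missing proportion in the other and are left off the plot.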

Section 3: Is there a difference between the sources of happiness between women in developed regions and those in undeveloped regions?

Note that the definitions of ‘developed’ and ‘undeveloped’ regions follow the United Nations website.

https://population.un.org/wpp/DefinitionOfRegions/

For this analysis, I’ll be using the countrycode library to identify the region each female respondent belongs to. Following this, based on the economic status of that region as per the United Nations, female respondents are split into developed and undeveloped categories.
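A minimal sketch of this mapping follows. It assumes the country column holds ISO 3-letter codes and uses countrycode's "region23" destination, which is my inference from the region names the analysis prints rather than something the report states. The developed_regions list (Europe, Northern America, Australia and New Zealand, plus Japan) is my reading of the UN grouping, and the toy table is hypothetical.

```r
library(dplyr)
library(countrycode)

# Hypothetical stand-in for the country column of all_women
toy_women <- tibble(country = c("USA", "IND", "DNK", "JPN", "NGA", NA))

# Assumption: regions treated as developed; Japan is handled as a single-country exception
developed_regions <- c("Northern America", "Northern Europe", "Southern Europe",
                       "Eastern Europe", "Western Europe", "Australia and New Zealand")

toy_women <- toy_women %>%
  mutate(
    region = countrycode(country, origin = "iso3c", destination = "region23"),
    development = case_when(
      is.na(region) ~ NA_character_,
      country == "JPN" | region %in% developed_regions ~ "developed",
      TRUE ~ "undeveloped"
    )
  )

print(toy_women)
```

Respondents whose country code is missing or cannot be matched keep a missing development label and would need to be dropped before comparing the two groups.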

##  [1] "USA" "DNK" "IND" "KWT" "FIN" "VEN" "CAN" "IRL" "GBR" "JAM" "ESP" NA   
## [13] "MEX" "ARM" "NGA" "PHL" "GRC" "LTU" "BGR" "TUR" "DZA" "IDN" "ZAF" "AUT"
## [25] "LKA" "PAK" "NZL" "SRB" "ETH" "PRI" "NIC" "NLD" "EGY" "AUS" "BEL" "DEU"
## [37] "ITA" "ASM" "THA" "UGA" "ARE" "JPN" "DOM" "UMI" "CYP" "PRT" "MYS" "FRA"
## [49] "BRB" "CZE" "BHS" "ISL" "SUR" "MKD" "TCA" "TTO" "SGP" "BRA" "ZMB" "AFG"
## [61] "TWN" "VIR" "SLV" "GTM" "NOR" "COL" "MDA"
##  [1] "Northern America"          "Northern Europe"          
##  [3] "Southern Asia"             "Western Asia"             
##  [5] "South America"             "Caribbean"                
##  [7] "Southern Europe"           NA                         
##  [9] "Central America"           "Western Africa"           
## [11] "South-Eastern Asia"        "Eastern Europe"           
## [13] "Northern Africa"           "Southern Africa"          
## [15] "Western Europe"            "Australia and New Zealand"
## [17] "Eastern Africa"            "Polynesia"                
## [19] "Eastern Asia"
Overview of Words that Women in Developed Regions Associate Happiness With
wordcloud2(word_count_all_women_developed, size = 0.6, rotateRatio = 0)
Overview of Words that Women in Undeveloped Regions Associate Happiness With
wordcloud2(word_count_all_women_undeveloped, size = 0.6, rotateRatio = 0)

Comparing the above 2 word clouds, it seems like women in developed regions find happiness in family, which has been a common theme. However, a notable difference is the prominence of the word “home” in the word cloud of women in developed regions. This may be attributed to the fact that women in developed regions often have the privilege of staying at home, whereas those in undeveloped areas may need to work diligently to support their families.

Top 10 Pairs of Words Women in Developed Regions Associate Happiness With

Top 10 Pairs of Words Women in Undeveloped Regions Associate Happiness With

Both visualizations support the observation made above: women in developed regions can afford to stay at home and find personally fulfilling activities. For women in undeveloped regions, this is often not an option, so their happiness appears to come from activities at work.

Top 10 Words Women in Developed Regions Associate Happiness With

Top 10 Words Women in Undeveloped Regions Associate Happiness With

Let’s proceed with topic modelling using Latent Dirichlet Allocation (LDA) to get the top 10 topics women in these regions associate happiness with.

Top 10 Topics Women in Developed Regions Associate Happiness With

Top 10 Topics Women in Undeveloped Regions Associate Happiness With

Overall, women in developed countries often find happiness in home-based activities and have the means to enjoy vacations with their families. In contrast, women in undeveloped countries, while also valuing shopping, may not have the same opportunities for extravagant vacations.

The next plot is a scatterplot that compares the happy moments of women in these two regions.

Word Proportion Scatterplot for Women in Undeveloped vs. Developed Regions

Section 4: Is there a difference between the sources of happiness of women belonging to different age groups?

Now, we will split the data into 3 groups: twenties (women aged 20 or older but younger than 30), thirties (aged 30 or older but younger than 40) and forties_and_over (aged 40 or older but younger than 100). Please note that teenagers are not considered in this analysis, since most of the topics, such as job and children, won’t apply to them. Additionally, the age column contains some outliers, with values like 233. Hence, the upper limit for age is 100, and the overall range is 20 to 100.
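A minimal sketch of this binning is shown here, assuming the age column may arrive as text and may contain non-numeric or implausible entries. The toy table and the age_group column name are hypothetical. The group labels follow the names used above.

```r
library(dplyr)

# Hypothetical stand-in for the age column of all_women
toy_women <- tibble(age = c("25", "30", "39", "40", "79", "19", "233", NA))

toy_women <- toy_women %>%
  # Convert to numeric; anything that is not a number becomes NA
  mutate(age = suppressWarnings(as.numeric(age))) %>%
  # Keep ages from 20 up to, but not including, 100
  filter(!is.na(age), age >= 20, age < 100) %>%
  # Left-closed bins: [20, 30), [30, 40), [40, 100)
  mutate(age_group = cut(age,
                         breaks = c(20, 30, 40, 100),
                         labels = c("twenties", "thirties", "forties_and_over"),
                         right = FALSE))

print(toy_women)
```

Setting right = FALSE puts a 30-year-old in the thirties group and a 40-year-old in forties_and_over, matching the boundaries described above.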

##  [1]  30  28  55  41  25  32  38  47  79  29  64  34  26  61  27  33  31  23  44
## [20]  22  21  35  45  46  62  65  36  39  52  53  42  57  43  20  24  48  37  51
## [39]  49  66  58  56  60  50  54  18  40  19  71  98  NA  70 227  68  59  69  67
## [58]   2  63  88  77  74  81 233  75  95   4  72  80  84  17   3  73
Overview of Words that Women in Their 20s Associate Happiness With
wordcloud2(word_count_twenties, size = 0.6, rotateRatio = 0)
Overview of Words that Women in Their 30s Associate Happiness With
wordcloud2(word_count_thirties, size = 0.6, rotateRatio = 0)
Overview of Words that Women 40+ Associate Happiness With

Through the progression of the word clouds, it becomes apparent that the word “friend” appears less prominently as women transition from their 20s to older age groups, with increasing emphasis on family. Additionally, celebrations seem to decrease in significance as middle-aged women prefer spending time at home with their children.

Top 10 Pairs of Words Women in Their 20s Associate Happiness With

Top 10 Pairs of Words Women in Their 30s Associate Happiness With

Top 10 Pairs of Words Women 40+ Associate Happiness With

In the bigram visualizations, it’s evident that as women age, their personal celebrations, such as birthdays and college journeys, are gradually replaced by those of their children. This shift suggests that women ultimately find more joy in celebrating their family’s milestones as they age.

Top 10 Words Women in Their 20s Associate Happiness With

Top 10 Words Women in Their 30s Associate Happiness With

Top 10 Words Women 40+ Associate Happiness With

Next, I perform Topic Modelling using Latent Dirichlet Allocation (LDA) to get top 10 topics women of different age groups associate happiness with.

Top 10 Topics Women in Their 20s Associate Happiness With

Top 10 Topics Women in Their 30s Associate Happiness With

Top 10 Topics Women 40+ Associate Happiness With

Here, we observe an interesting series of trends that reveal previously unexplored topics. Women in their 20s appear to find happiness in food, while those in their 30s lean towards shopping, and those aged 40 and above seem to derive joy from nature.

Conclusion

  1. Across different age groups, marital statuses, and geographic locations, family emerges as the primary source of happiness for women, highlighting the crucial role of family support and strength in women’s lives.
  2. Besides family, women also find happiness in various other topics, including shopping, celebrations, and nature.
  3. For married women, their children become a constant and significant source of happiness. The transition from boyfriends to husbands marks a shift in the sources of joy, with activities like cooking and sharing meals with their husbands often bringing peace and contentment.
  4. Notably, the key distinction in the sources of happiness between women in developed and undeveloped regions lies in privilege. Women in less developed areas tend to find happiness in hard work, while their counterparts in more developed regions can enjoy the luxury of staying home and even taking vacations. Despite these disparities, both groups share a common appreciation for shopping, adapted to their financial circumstances.
  5. When examining women across different age groups, a clear trend emerges: personal celebrations give way to those of their children as they age, emphasizing the increasing importance of family and spending quality time with loved ones.